

<!DOCTYPE html>
<html lang="zh-CN">

<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no">
  <meta http-equiv="X-UA-Compatible" content="ie=edge">
  <title>用Appium+Python爬取朋友圈实习信息 - cxp&#39;s blog</title>
  <meta name="apple-mobile-web-app-capable" content="yes" />
  <meta name="apple-mobile-web-app-status-bar-style" content="black-translucent">
  <meta name="google" content="notranslate" />

  
  
  <meta name="description" content=" 前期准备工作

安装MongoDB  NoSQL 建..."> 
  
  <meta name="author" content="Alex"> 

  
    <link rel="icon" href="/images/icons/favicon-16x16.png" type="image/png" sizes="16x16">
  
  
    <link rel="icon" href="/images/icons/favicon-32x32.png" type="image/png" sizes="32x32">
  
  
    <link rel="apple-touch-icon" href="/images/icons/apple-touch-icon.png" sizes="180x180">
  
  
    <meta rel="mask-icon" href="/images/icons/stun-logo.svg" color="#333333">
  
  
    <meta rel="msapplication-TileImage" content="/images/icons/favicon-144x144.png">
    <meta rel="msapplication-TileColor" content="#000000">
  

  <link rel="stylesheet" href="/css/style.css">

  
  <link rel="stylesheet" href="//at.alicdn.com/t/font_1445822_h1619vhl1nr.css">
  

  
  
  <link rel="stylesheet" href="https://cdn.bootcss.com/fancybox/3.5.7/jquery.fancybox.min.css">
  

  
  
  <link rel="stylesheet" href="https://cdn.bootcss.com/highlight.js/9.18.1/styles/xcode.min.css">
  

  <script>
    var CONFIG = window.CONFIG || {};
    var ZHAOO = window.ZHAOO || {};
    CONFIG = {
      isHome: false,
      fancybox: true,
      pjax: false,
      lazyload: {
        enable: true,
        loadingImage: '',
      },
      donate: {
        enable: true,
        alipay: 'https://pic.izhaoo.com/alipay.jpg',
        wechat: 'https://pic.izhaoo.com/wechat.jpg'
      },
      motto: {
        api: '',
        default: '我在开了灯的床头下，想问问自己的心啊。'
      },
      galleries: {
        enable: true
      },
      fab: {
        enable: true,
        alwaysShow: false
      },
      carrier: {
        enable: true
      },
      daovoice: {
        enable: true
      }
    }
  </script>

  

  
<link rel="alternate" href="/atom.xml" title="cxp's blog" type="application/atom+xml">
</head>
<body class="lock-screen">
  <div class="loading"></div>
  


<nav class="navbar">
  <div class="left"></div>
  <div class="center">用Appium+Python爬取朋友圈实习信息</div>
  <div class="right">
    <i class="iconfont iconmenu j-navbar-menu"></i>
  </div>
</nav>

  <nav class="menu">
  <div class="menu-wrap">
    <div class="menu-close">
      <i class="iconfont iconbaseline-close-px"></i>
    </div>
    <ul class="menu-content">
      
      
      
      
      <li class="menu-item"><a href="/ " class="underline"> 首页</a></li>
      
      
      
      
      <li class="menu-item"><a href="/galleries " class="underline"> 摄影</a></li>
      
      
      
      
      <li class="menu-item"><a href="/archives " class="underline"> 归档</a></li>
      
      
      
      
      <li class="menu-item"><a href="/tags " class="underline"> 标签</a></li>
      
      
      
      
      <li class="menu-item"><a href="/categories " class="underline"> 分类</a></li>
      
      
      
      
      <li class="menu-item"><a href="/about " class="underline"> 关于</a></li>
      
    </ul>
    <div class="menu-copyright"><p>Powered by <a target="_blank" href="https://hexo.io">Hexo</a>  |  Theme - <a target="_blank" href="https://github.com/izhaoo/hexo-theme-zhaoo">zhaoo</a></p></div>
  </div>
</nav>
  <main id="main">
  <div class="container" id="container">
    <article class="article">
  <div class="wrap">
    <section class="head">
  <img   class="lazyload" data-original="/images/theme/post-image.jpg" src=""  draggable="false">
  <div class="head-mask">
    <h1 class="head-title">用Appium+Python爬取朋友圈实习信息</h1>
    <div class="head-info">
      <span class="post-info-item"><i class="iconfont iconcalendar"></i>三月 18, 2019</span
        class="post-info-item">
      
      <span class="post-info-item"><i class="iconfont iconfont-size"></i>5476</span>
    </div>
  </div>
</section>
    <section class="main">
      <section class="content">
        <h1 id="前期准备工作"><a class="markdownIt-Anchor" href="#前期准备工作"></a> 前期准备工作</h1>
<ul>
<li>安装MongoDB  NoSQL 建议最后安装可视化环境，不然会面临卡在安装界面很长时间。</li>
<li>安装pymongo模块，pip3安装</li>
<li>安装Android开发环境，下载Android Studio，并安装相应的安卓SDK版本</li>
<li>安装JDK,需要使用java命令</li>
<li>安装Appium，从Github中下载；另外还需要安装模块appium，后边python程序需要做接口，安装具体方法：<code>pip install Appium_Python_Client</code></li>
<li>使用过程中要电脑连接手机，打开手机调试开关。</li>
</ul>
<h1 id="一-关于appium"><a class="markdownIt-Anchor" href="#一-关于appium"></a> 一、关于Appium</h1>
<blockquote>
<p>Appium是一种移动端的自动化测试工具，和PC端使用的Selenium类似，该工具可以驱动Android、iOS等设备完成自动化测试。</p>
</blockquote>
<p>首先打开appium软件启动服务，相当于打开了一个Appium服务器，使用端口名称为：<code>4723</code></p>
<p>然后<code>Start Inspecter Session</code><br />
在进行如下配置：<br />
需要在用adb devices -l 获取你的手机设备名字(<code>model字段的内容</code>)</p>
<p><img   class="lazyload" data-original="q.jpg" src=""  alt="" /></p>
<p>配置好之后点击<code>Start Session</code><br />
进入微信登录界面如下，左边的窗口会实时显示手机屏幕信息，中间块显示每个块的信息，右边部分显示一些属性如<code>xpath和id</code>等。这个窗口是为了在后边编写代码时获取相应XPATH和ID的。</p>
<p><img   class="lazyload" data-original="1.png" src=""  alt="" /></p>
<p>按照爬取流程可能用到的依次是登录、密码、下一步、发现、朋友圈、每条朋友圈的昵称、内容和分享的文章名。<br />
对了会出现提示你是否打开手机通讯录，需要在Appium中实际操作下，记录否的按钮对应的元素ID。</p>
<h1 id="二-爬取朋友圈信息代码"><a class="markdownIt-Anchor" href="#二-爬取朋友圈信息代码"></a> 二、爬取朋友圈信息代码</h1>
<p>具体ID每个手机设备都不一样，需要自己在Appium中获取。</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> os</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">2</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> appium <span class="keyword">import</span> webdriver</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">3</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> appium.webdriver.common.touch_action <span class="keyword">import</span> TouchAction</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">4</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium.common.exceptions <span class="keyword">import</span> NoSuchElementException</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">5</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium.webdriver.common.by <span class="keyword">import</span> By</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">6</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium.webdriver.support.ui <span class="keyword">import</span> WebDriverWait</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">7</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> selenium.webdriver.support <span class="keyword">import</span> expected_conditions <span class="keyword">as</span> EC</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">8</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> pymongo <span class="keyword">import</span> MongoClient</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">9</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">from</span> time <span class="keyword">import</span> sleep</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">10</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">11</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># 平台</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">12</span></pre></td><td class="code"><pre><span class="line">PLATFORM = <span class="string">'Android'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">13</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">14</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># 手机设备名称</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">15</span></pre></td><td class="code"><pre><span class="line">DEVICE_NAME = <span class="string">'TA_1054'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">16</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">17</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># APP包名</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">18</span></pre></td><td class="code"><pre><span class="line">APP_PACKAGE = <span class="string">'com.tencent.mm'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">19</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">20</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># 入口类名</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">21</span></pre></td><td class="code"><pre><span class="line">APP_ACTIVITY = <span class="string">'.ui.LauncherUI'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">22</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">23</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># Appium地址</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">24</span></pre></td><td class="code"><pre><span class="line">DRIVER_SERVER = <span class="string">'http://localhost:4723/wd/hub'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">25</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># 超时时间设置</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">26</span></pre></td><td class="code"><pre><span class="line">TIMEOUT = <span class="number">500</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">27</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">28</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># 微信手机号密码</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">29</span></pre></td><td class="code"><pre><span class="line">USERNAME = <span class="string">'NUMBER'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">30</span></pre></td><td class="code"><pre><span class="line">PASSWORD = <span class="string">'PASSWD'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">31</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">32</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># 滑动位置和距离</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">33</span></pre></td><td class="code"><pre><span class="line">FLICK_START_X = <span class="number">300</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">34</span></pre></td><td class="code"><pre><span class="line">FLICK_START_Y = <span class="number">300</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">35</span></pre></td><td class="code"><pre><span class="line">FLICK_DISTANCE = <span class="number">700</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">36</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">37</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># MongoDB配置</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">38</span></pre></td><td class="code"><pre><span class="line">MONGO_URL = <span class="string">'localhost'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">39</span></pre></td><td class="code"><pre><span class="line">MONGO_DB = <span class="string">'moments'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">40</span></pre></td><td class="code"><pre><span class="line">MONGO_COLLECTION = <span class="string">'moments'</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">41</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">42</span></pre></td><td class="code"><pre><span class="line"><span class="comment"># 滑动间隔</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">43</span></pre></td><td class="code"><pre><span class="line">SCROLL_SLEEP_TIME = <span class="number">2</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">44</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">45</span></pre></td><td class="code"><pre><span class="line"><span class="class"><span class="keyword">class</span> <span class="title">Moments</span><span class="params">()</span>:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">46</span></pre></td><td class="code"><pre><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">__init__</span><span class="params">(self)</span>:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">47</span></pre></td><td class="code"><pre><span class="line">        <span class="string">"""</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">48</span></pre></td><td class="code"><pre><span class="line"><span class="string">        初始化</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">49</span></pre></td><td class="code"><pre><span class="line"><span class="string">        """</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">50</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 驱动配置</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">51</span></pre></td><td class="code"><pre><span class="line">        self.desired_caps = &#123;</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">52</span></pre></td><td class="code"><pre><span class="line">            <span class="string">'platformName'</span>: PLATFORM,</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">53</span></pre></td><td class="code"><pre><span class="line">            <span class="string">'deviceName'</span>: DEVICE_NAME,</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">54</span></pre></td><td class="code"><pre><span class="line">            <span class="string">'appPackage'</span>: APP_PACKAGE,</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">55</span></pre></td><td class="code"><pre><span class="line">            <span class="string">'appActivity'</span>: APP_ACTIVITY</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">56</span></pre></td><td class="code"><pre><span class="line">        &#125;</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">57</span></pre></td><td class="code"><pre><span class="line">        self.driver = webdriver.Remote(DRIVER_SERVER, self.desired_caps)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">58</span></pre></td><td class="code"><pre><span class="line">        self.wait = WebDriverWait(self.driver, TIMEOUT)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">59</span></pre></td><td class="code"><pre><span class="line">        self.client = MongoClient(MONGO_URL)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">60</span></pre></td><td class="code"><pre><span class="line">        self.db = self.client[MONGO_DB]</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">61</span></pre></td><td class="code"><pre><span class="line">        self.collection = self.db[MONGO_COLLECTION]</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">62</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">63</span></pre></td><td class="code"><pre><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">login</span><span class="params">(self)</span>:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">64</span></pre></td><td class="code"><pre><span class="line">        <span class="string">"""</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">65</span></pre></td><td class="code"><pre><span class="line"><span class="string">        登录微信</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">66</span></pre></td><td class="code"><pre><span class="line"><span class="string">        :return:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">67</span></pre></td><td class="code"><pre><span class="line"><span class="string">        """</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">68</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 登录按钮</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">69</span></pre></td><td class="code"><pre><span class="line">        print(<span class="string">'开始登陆'</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">70</span></pre></td><td class="code"><pre><span class="line">        login = self.wait.until(EC.presence_of_element_located((By.ID, <span class="string">'com.tencent.mm:id/e4g'</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">71</span></pre></td><td class="code"><pre><span class="line">        login.click()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">72</span></pre></td><td class="code"><pre><span class="line">        print(<span class="string">'点击登录了'</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">73</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 手机号</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">74</span></pre></td><td class="code"><pre><span class="line">        phone = self.wait.until(EC.presence_of_element_located((By.ID, <span class="string">'com.tencent.mm:id/kh'</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">75</span></pre></td><td class="code"><pre><span class="line">        phone.set_text(USERNAME)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">76</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 下一步</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">77</span></pre></td><td class="code"><pre><span class="line">        next = self.wait.until(EC.element_to_be_clickable((By.ID, <span class="string">'com.tencent.mm:id/axt'</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">78</span></pre></td><td class="code"><pre><span class="line">        next.click()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">79</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 密码</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">80</span></pre></td><td class="code"><pre><span class="line">        password = self.wait.until(</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">81</span></pre></td><td class="code"><pre><span class="line">            EC.presence_of_element_located((By.XPATH, <span class="string">'//*[@resource-id="com.tencent.mm:id/kh"][1]'</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">82</span></pre></td><td class="code"><pre><span class="line">        password.set_text(PASSWORD)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">83</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 提交</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">84</span></pre></td><td class="code"><pre><span class="line">        submit = self.wait.until(EC.element_to_be_clickable((By.ID, <span class="string">'com.tencent.mm:id/axt'</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">85</span></pre></td><td class="code"><pre><span class="line">        submit.click()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">86</span></pre></td><td class="code"><pre><span class="line">        submit = self.wait.until(EC.element_to_be_clickable((By.ID, <span class="string">'com.tencent.mm:id/az9'</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">87</span></pre></td><td class="code"><pre><span class="line">        submit.click()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">88</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">89</span></pre></td><td class="code"><pre><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">enter</span><span class="params">(self)</span>:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">90</span></pre></td><td class="code"><pre><span class="line">        <span class="string">"""</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">91</span></pre></td><td class="code"><pre><span class="line"><span class="string">        进入朋友圈</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">92</span></pre></td><td class="code"><pre><span class="line"><span class="string">        :return:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">93</span></pre></td><td class="code"><pre><span class="line"><span class="string">        """</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">94</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 点击发现</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">95</span></pre></td><td class="code"><pre><span class="line">        tab = self.wait.until(</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">96</span></pre></td><td class="code"><pre><span class="line">            EC.presence_of_element_located((By.ID, <span class="string">' '</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">97</span></pre></td><td class="code"><pre><span class="line">        tab.click()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">98</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 朋友圈</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">99</span></pre></td><td class="code"><pre><span class="line">        moments = self.wait.until(EC.presence_of_element_located((By.ID, <span class="string">' '</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">100</span></pre></td><td class="code"><pre><span class="line">        moments.click()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">101</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">102</span></pre></td><td class="code"><pre><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">crawl</span><span class="params">(self)</span>:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">103</span></pre></td><td class="code"><pre><span class="line">        <span class="string">"""</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">104</span></pre></td><td class="code"><pre><span class="line"><span class="string">        爬取</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">105</span></pre></td><td class="code"><pre><span class="line"><span class="string">        :return:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">106</span></pre></td><td class="code"><pre><span class="line"><span class="string">        """</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">107</span></pre></td><td class="code"><pre><span class="line">        <span class="keyword">while</span> <span class="literal">True</span>:</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">108</span></pre></td><td class="code"><pre><span class="line">            <span class="comment"># 当前页面显示的所有状态</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">109</span></pre></td><td class="code"><pre><span class="line">            items = self.wait.until(</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">110</span></pre></td><td class="code"><pre><span class="line">                EC.presence_of_all_elements_located(</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">111</span></pre></td><td class="code"><pre><span class="line">                    (By.XPATH, <span class="string">'//*[@resource-id="com.tencent.mm:id/avi"]//android.widget.FrameLayout'</span>)))</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">112</span></pre></td><td class="code"><pre><span class="line">            <span class="comment"># 上滑</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">113</span></pre></td><td class="code"><pre><span class="line">            self.driver.swipe(FLICK_START_X, FLICK_START_Y + FLICK_DISTANCE, FLICK_START_X, FLICK_START_Y)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">114</span></pre></td><td class="code"><pre><span class="line">            <span class="comment"># 遍历每条状态</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">115</span></pre></td><td class="code"><pre><span class="line">            <span class="keyword">for</span> item <span class="keyword">in</span> items:</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">116</span></pre></td><td class="code"><pre><span class="line">                <span class="keyword">try</span>:</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">117</span></pre></td><td class="code"><pre><span class="line">                    <span class="comment"># 昵称</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">118</span></pre></td><td class="code"><pre><span class="line">                    nickname = item.find_element_by_id(<span class="string">'com.tencent.mm:id/b5o'</span>).get_attribute(<span class="string">'text'</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">119</span></pre></td><td class="code"><pre><span class="line">                    <span class="comment"># 正文</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">120</span></pre></td><td class="code"><pre><span class="line">                    content = item.find_element_by_id(<span class="string">'com.tencent.mm:id/ejc'</span>).get_attribute(<span class="string">'text'</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">121</span></pre></td><td class="code"><pre><span class="line">                    <span class="comment"># 获取分享的内容</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">122</span></pre></td><td class="code"><pre><span class="line">                    share = item.find_element_by_id(<span class="string">'com.tencent.mm:id/czg'</span>).get_attribute(<span class="string">'text'</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">123</span></pre></td><td class="code"><pre><span class="line">                    <span class="comment">#创建存储内容</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">124</span></pre></td><td class="code"><pre><span class="line">                    data = &#123;</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">125</span></pre></td><td class="code"><pre><span class="line">                        <span class="string">'nickname'</span>: nickname,</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">126</span></pre></td><td class="code"><pre><span class="line">                        <span class="string">'content'</span>: content,</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">127</span></pre></td><td class="code"><pre><span class="line">                        <span class="string">'share'</span>: share,</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">128</span></pre></td><td class="code"><pre><span class="line">                    &#125;</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">129</span></pre></td><td class="code"><pre><span class="line">                    <span class="comment"># 插入MongoDB</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">130</span></pre></td><td class="code"><pre><span class="line">                    self.collection.update(&#123;<span class="string">'nickname'</span>: nickname, <span class="string">'content'</span>: content, <span class="string">'share'</span>: share&#125;, &#123;<span class="string">'$set'</span>: data&#125;, <span class="literal">True</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">131</span></pre></td><td class="code"><pre><span class="line">                    sleep(SCROLL_SLEEP_TIME)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">132</span></pre></td><td class="code"><pre><span class="line">                <span class="keyword">except</span> NoSuchElementException:</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">133</span></pre></td><td class="code"><pre><span class="line">                    <span class="keyword">pass</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">134</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">135</span></pre></td><td class="code"><pre><span class="line">    <span class="function"><span class="keyword">def</span> <span class="title">main</span><span class="params">(self)</span>:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">136</span></pre></td><td class="code"><pre><span class="line">        <span class="string">"""</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">137</span></pre></td><td class="code"><pre><span class="line"><span class="string">        入口</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">138</span></pre></td><td class="code"><pre><span class="line"><span class="string">        :return:</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">139</span></pre></td><td class="code"><pre><span class="line"><span class="string">        """</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">140</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 登录</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">141</span></pre></td><td class="code"><pre><span class="line">        self.login()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">142</span></pre></td><td class="code"><pre><span class="line">        time.sleep(<span class="number">25</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">143</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 进入朋友圈</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">144</span></pre></td><td class="code"><pre><span class="line">        self.enter()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">145</span></pre></td><td class="code"><pre><span class="line">        time.sleep(<span class="number">15</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">146</span></pre></td><td class="code"><pre><span class="line">        <span class="comment"># 爬取</span></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">147</span></pre></td><td class="code"><pre><span class="line">        self.crawl()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">148</span></pre></td><td class="code"><pre><span class="line"></span></pre></td></tr><tr><td class="gutter"><pre><span class="line">149</span></pre></td><td class="code"><pre><span class="line"><span class="keyword">if</span> __name__ == <span class="string">'__main__'</span>:</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">150</span></pre></td><td class="code"><pre><span class="line">    moments = Moments()</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">151</span></pre></td><td class="code"><pre><span class="line">    time.sleep(<span class="number">6</span>)</span></pre></td></tr><tr><td class="gutter"><pre><span class="line">152</span></pre></td><td class="code"><pre><span class="line">    moments.main()</span></pre></td></tr></table></figure>
<p><strong>使用Mongo保存的数据，可以通过打开MongoDB Compass打开数据库查看数据</strong></p>
<p><img   class="lazyload" data-original="2.png" src=""  alt="" /></p>
<p>这样的话，在我的朋友圈有个微信号实时发送实习信息，爬取后可以通过数据库检索，获取所有关于实习的信息。或者其他有趣的信息。</p>
<blockquote>
<p>问题：从朋友圈中无法能够提取朋友发送朋友圈的时间信息，还需要在找资料解决。</p>
</blockquote>

      </section>
      <section class="extra">
        
        <ul class="copyright">
  
  <li><strong>本文作者：</strong>Alex</li>
  <li><strong>本文链接：</strong><a href="https://cxpeng.cn/archives/45d22ee3.html">https://cxpeng.cn/archives/45d22ee3.html</a></li>
  <li><strong>版权声明：</strong>本博客所有文章均采用<a href="https://creativecommons.org/licenses/by-nc-sa/4.0/deed.zh"
      rel="external nofollow" target="_blank"> BY-NC-SA </a>许可协议，转载请注明出处！</li>
  
</ul>
        
        
        <section class="donate">
  <div class="qrcode">
    <img   class="lazyload" data-original="https://pic.izhaoo.com/alipay.jpg" src="" >
  </div>
  <div class="icon">
    <a href="javascript:;" target="_blank" rel="noopener" id="alipay"><i class="iconfont iconalipay"></i></a>
    <a href="javascript:;" target="_blank" rel="noopener" id="wechat"><i class="iconfont iconwechat-fill"></i></a>
  </div>
</section>
        
        
  <ul class="tag-list" itemprop="keywords"><li class="tag-list-item"><a class="tag-list-link" href="/tags/Python/" rel="tag">Python</a></li><li class="tag-list-item"><a class="tag-list-link" href="/tags/%E7%88%AC%E8%99%AB/" rel="tag">爬虫</a></li></ul>

        
<nav class="nav">
  
    <a href="/archives/9c2ffa7.html"><i class="iconfont iconleft"></i>数据分析与挖掘（一）</a>
  
  
    <a href="/archives/270bf2b5.html">PYTHON编程：从入门到实践之数据可视化<i class="iconfont iconright"></i></a>
  
</nav>

      </section>
      
      <section class="comments">
  
  <div class="btn" id="comments-btn">查看评论</div>
  
  
</section>
      
    </section>
  </div>
</article>
  </div>
</main>
  <footer class="footer">
  <div class="footer-social">
    
    
    
    
    
    <a href="tencent://message/?Menu=yes&uin=894519210 " target="_blank" onMouseOver="this.style.color= '#12B7F5'"
      onMouseOut="this.style.color='#33333D'">
      <i class="iconfont footer-social-item  iconQQ "></i>
    </a>
    
    
    
    
    
    <a href="javascript:; " target="_blank" onMouseOver="this.style.color= '#09BB07'"
      onMouseOut="this.style.color='#33333D'">
      <i class="iconfont footer-social-item  iconwechat-fill "></i>
    </a>
    
    
    
    
    
    <a href="https://www.instagram.com/izhaoo/ " target="_blank" onMouseOver="this.style.color= '#DA2E76'"
      onMouseOut="this.style.color='#33333D'">
      <i class="iconfont footer-social-item  iconinstagram "></i>
    </a>
    
    
    
    
    
    <a href="https://github.com/izhaoo " target="_blank" onMouseOver="this.style.color= '#24292E'"
      onMouseOut="this.style.color='#33333D'">
      <i class="iconfont footer-social-item  icongithub-fill "></i>
    </a>
    
    
    
    
    
    <a href="mailto:izhaoo@163.com " target="_blank" onMouseOver="this.style.color='#FFBE5B'"
      onMouseOut="this.style.color='#33333D'">
      <i class="iconfont footer-social-item  iconmail"></i>
    </a>
    
  </div>
  <div class="footer-copyright"><p>Powered by <a target="_blank" href="https://hexo.io">Hexo</a>  |  Theme - <a target="_blank" href="https://github.com/izhaoo/hexo-theme-zhaoo">zhaoo</a></p></div>
</footer>
  
      <div class="fab fab-plus">
    <i class="iconfont iconplus"></i>
  </div>
  
  <div class="fab fab-daovoice">
    <i class="iconfont iconcomment"></i>
  </div>
  
  <div class="fab fab-up">
    <i class="iconfont iconcaret-up"></i>
  </div>
  
<script type="text/x-mathjax-config">
    MathJax.Hub.Config({
        tex2jax: {
            inlineMath: [ ["$","$"], ["\\(","\\)"] ],
            skipTags: ['script', 'noscript', 'style', 'textarea', 'pre', 'code'],
            processEscapes: true
        }
    });
    MathJax.Hub.Queue(function() {
        var all = MathJax.Hub.getAllJax();
        for (var i = 0; i < all.length; ++i)
            all[i].SourceElement().parentNode.className += ' has-jax';
    });
</script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.1/MathJax.js?config=TeX-MML-AM_CHTML"></script>
</body>

<script src="https://cdn.bootcss.com/jquery/3.4.1/jquery.min.js"></script>




<script src="https://cdn.bootcdn.net/ajax/libs/jquery.lazyload/1.9.1/jquery.lazyload.min.js"></script>




<script src="https://cdn.bootcss.com/fancybox/3.5.7/jquery.fancybox.min.js"></script>




<script src="/js/utils.js"></script>
<script src="/js/modules.js"></script>
<script src="/js/zui.js"></script>
<script src="/js/script.js"></script>




<script>
  (function (i, s, o, g, r, a, m) {
    i["DaoVoiceObject"] = r;
    i[r] = i[r] || function () {
      (i[r].q = i[r].q || []).push(arguments)
    }, i[r].l = 1 * new Date();
    a = s.createElement(o), m = s.getElementsByTagName(o)[0];
    a.async = 1;
    a.src = g;
    a.charset = "utf-8";
    m.parentNode.insertBefore(a, m)
  })(window, document, "script", ('https:' == document.location.protocol ? 'https:' : 'http:') +
    "//widget.daovoice.io/widget/0f81ff2f.js", "daovoice")
  daovoice('init', {
    app_id: "abcdefg"
  }, {
    launcher: {
      disableLauncherIcon: true,
    },
  });
  daovoice('update');
</script>



<script>
  (function () {
    var bp = document.createElement('script');
    var curProtocol = window.location.protocol.split(':')[0];
    if (curProtocol === 'https') {
      bp.src = 'https://zz.bdstatic.com/linksubmit/push.js';
    } else {
      bp.src = 'http://push.zhanzhang.baidu.com/push.js';
    }
    var s = document.getElementsByTagName("script")[0];
    s.parentNode.insertBefore(bp, s);
  })();
</script>


<script>
  var _hmt = _hmt || [];
  (function () {
    var hm = document.createElement("script");
    hm.src = "https://hm.baidu.com/hm.js?4c204d8bc027a0455b5fc642ac334ca8";
    var s = document.getElementsByTagName("script")[0];
    s.parentNode.insertBefore(hm, s);
  })();
</script>










</html>